ENH: usecols takes input order for read_csv implementation review #61967
… complicated usecols order
False,
": bool\n "
"Whether usecols parameter will use order of input when "
"making a DataFrame. \n This feature will be default in pandas 3.0"
I think if this option is introduced in 3.0, the new behavior won't be enforced until 4.0.
Sounds good. Is that the only concern with the implementation? If so, I'll go ahead and apply it to the other functions and update the docs.
I can also add changing the flag's default to the 4.0 milestone. I don't know if there is a timeline for 4.0 yet, but I figured it's better to add it while it's fresh.
I think we said this should default to "warn"?
Yeah, have a warning pop up telling users that this is a future change coming in 4.0. Unless you're talking about something else?
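For reference, the "warn" default discussed above could behave roughly like the sketch below. This is not the code in this PR: the helper `order_selected_columns` and its `use_input_order` parameter are hypothetical names made up for illustration, and the warning text is an assumption.

```python
import warnings


def order_selected_columns(file_columns, usecols, use_input_order="warn"):
    """Hypothetical helper: decide the output column order for a usecols selection.

    use_input_order=True   -> follow the order given in usecols
    use_input_order=False  -> keep the order found in the file (current behavior)
    use_input_order="warn" -> keep file order, but warn when the two orders differ
    """
    present = set(file_columns)
    if use_input_order is True:
        # New behavior: respect the order the caller asked for.
        return [col for col in usecols if col in present]

    selected = set(usecols)
    file_order = [col for col in file_columns if col in selected]
    if use_input_order == "warn" and file_order != list(usecols):
        # Assumed wording; the real message would be decided in review.
        warnings.warn(
            "usecols order will be respected in a future version; "
            "columns are currently returned in file order.",
            FutureWarning,
            stacklevel=2,
        )
    return file_order
```

With this shape, users who already pass `usecols` in file order never see the warning; only calls whose result would change under the new default are flagged.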
This is just the implementation of usecols input ordering for read_csv. I wanted people to look at it before I apply it to other places like read_excel and read_clipboard. If it all looks good, I'll go back and add the necessary documentation about the future deprecation, along with a warning shown when usecols is used. This is mainly to check that the implementation doesn't have any glaring issues.
I ran the entire test suite just to be safe and it all looks good. The only things that errored were some datetime tests that, as far as I could find, had nothing to do with these changes.
Oh, it is also worth noting that pyarrow already uses the usecols order by default, so that's probably worth adding to the documentation regardless.
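For anyone reviewing without the context, here is a small sketch of the behavior this PR is about, using the default parser engine and only existing pandas calls. The CSV contents and variable names are made up for illustration.

```python
import io

import pandas as pd

# Illustrative data: the file stores its columns in the order a, b, c.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"
wanted = ["c", "a"]

# With the default engine, usecols selects the columns but the result
# keeps the file's column order rather than the order passed in.
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
file_order = list(df.columns)

# The workaround today is to reindex the columns after reading.
reordered = df[wanted]
requested_order = list(reordered.columns)
```

The extra `df[wanted]` step is what the new option would make unnecessary.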